Long-term exposure to PM2.5 air pollution and mental health: a retrospective cohort study in Ireland

Background: Mental illness is the leading cause of years lived with disability, and the global disease burden of mental ill-health has increased substantially in the last number of decades. There is now increasing evidence that environmental conditions, and in particular poor air quality, may be associated with mental health and wellbeing.
Methods: This cross-sectional analysis uses data on mental health and wellbeing from The Irish Longitudinal Study on Ageing (TILDA), a nationally representative survey of the population aged 50+ in Ireland. Annual average PM2.5 concentrations at respondents’ residential addresses over the period 1998–2014 are used to measure long-term exposure to ambient PM2.5.
Results: We find evidence of associations between long-term exposure to ambient PM2.5 and depression and anxiety. The measured associations are strong, and are comparable with effect sizes for variables such as sex. Effects are also evident at relatively low concentrations by international standards. However, we find no evidence of associations between long-term ambient particulate pollution and other indicators of mental health and well-being such as stress, worry and quality of life.
Conclusions: The measured associations are strong, particularly considering the relatively low PM2.5 concentrations prevailing in Ireland compared to many other countries. While it is estimated that over 90 per cent of the world’s population lives in areas with annual mean PM2.5 concentrations greater than 10 μg/m3, these results contribute to the increasing evidence that suggests that harmful effects can be detected at even low levels of air pollution.


Introduction
The World Health Organization (WHO) estimates that 4 million premature deaths every year are a result of ambient (outdoor) air pollution [1]. The burden of disease attributable to air pollution is now estimated to be comparable with other major global health risks such as unhealthy diet and tobacco smoking, and was in the top five out of 87 risk factors for male and female deaths in 2019 [2]. As a result, air pollution is now recognised as the single largest environmental threat to public health [1]. Although air pollution has decreased in most European countries, including Ireland, over the past two decades, levels of ambient air pollution remain above WHO guidelines in many cities and towns in Ireland [3].
Air pollution contains many individual pollutants, including particulate matter (PM), gaseous pollutants and metallic and organic compounds [4]. While the European Union and international organisations such as the WHO issue guidelines in relation to multiple types of air pollution, exposure to fine particulate matter of 2.5 microns or less in diameter (PM2.5) is considered to be particularly damaging to health [1,2]. PM2.5 particles can penetrate and lodge deep inside the lungs, and, along with ultrafine particles, may even enter the blood system, affecting major organs [1]. There is now strong causal evidence of associations between exposure to PM2.5 and all-cause mortality, as well as acute lower respiratory infections, chronic obstructive pulmonary disease (COPD), ischaemic heart disease (IHD), lung cancer and stroke [1,5-7]. Recent systematic reviews have also shown strong evidence of associations between PM2.5 and other health outcomes such as diabetes [8], infant health [9], cognitive functioning [10] and dementia [11]. The available evidence also suggests that the health-damaging effects of PM2.5 air pollution operate even at low exposure levels [1,5]. As a result, the new WHO air quality guidelines (AQGs) are now set at levels of 5 μg/m3 (annual mean) and 15 μg/m3 (24-h mean), levels substantially below those that were in place before 2021 [1].
While the bulk of past research focuses on the effects of PM2.5 on physical health and mortality, some recent research has also found evidence of associations between exposure to ambient PM2.5 air pollution and mental health and wellbeing. For example, a recent systematic review of 22 studies supports the hypothesis that there could be an association between long-term PM2.5 exposure (≥ 6 months) and mental health outcomes including depression and anxiety [4]. For depression prevalence, the authors report a pooled odds ratio of 1.102 per 10 μg/m3 increase in PM2.5 exposure (95% CI: 1.023, 1.189), based on a meta-analysis of five studies. In addition, two of the included studies found statistically significant positive associations between long-term PM2.5 exposure and the prevalence of anxiety symptoms above a threshold that was considered clinically relevant. Another meta-analysis of six cohort and two cross-sectional studies found a statistically significant relationship between long-term (≥ 1 year) exposure to PM2.5 (OR = 1.06, 95% CI: 1.00, 1.13 per 5 μg/m3 increase) and depression prevalence [12]. Other studies using a variety of statistical methodologies, as well as indicators for air pollution exposure, are suggestive of an association between ambient air pollution and other indicators of mental health and wellbeing such as the incidence of schizophrenia spectrum disorder [13], and the prevalence of suicide [14], anxiety [13,15], depression [13], bipolar disorder [16] and life satisfaction [17].
Despite this growing evidence base on the links between ambient air pollution and mental health, most of the previous studies used data on relatively short-term exposures. For example, the systematic review by [4] identified 22 qualifying studies on this topic, but only five of these examined long-term exposures (defined as over six months). However, some hypothesised channels through which PM2.5 might affect mental health are likely to operate over a longer exposure period [4,12,16], including:
• Inflammation affecting the central nervous system;
• Changes in stress responsivity, via hypothalamic-pituitary-adrenal (HPA) axis activation; and
• Adverse effects on cognitive development and dementia risk.
Focusing exclusively on short-term pollution exposures might miss or understate the impact of longer-term processes. In the present study, we have access to long-term residential histories for a large representative sample of people aged over 50 in Ireland. This cross-sectional analysis uses data on mental health and wellbeing from The Irish Longitudinal Study on Ageing (TILDA), a nationally representative survey of the population aged 50+ in Ireland. Annual average PM2.5 concentrations at respondents' residential addresses over the period 1998-2014 are used to measure long-term exposure to ambient PM2.5. Since the available data are at individual level, we can also allow for a wide range of possible confounding socioeconomic factors.

Study population
The Irish Longitudinal Study on Ageing (TILDA) is a population-based, nationally-representative, longitudinal study of 8,504 community-dwelling adults in Ireland aged 50 and older and their partners of any age. The dataset contains a rich set of variables on the health and socioeconomic circumstances of older people. The study is harmonised with other international longitudinal studies of ageing, such as the Survey of Health, Ageing and Retirement in Europe (SHARE), the English Longitudinal Study on Ageing (ELSA) and the Health and Retirement Study (HRS) in the US. Baseline data collection took place in 2009-2011, and participants have been followed up at two-year intervals since then. Data are collected via a variety of modes, including computer-aided personal interviewing (CAPI), a self-completion questionnaire (SCQ) and a nurse-led health assessment (the latter carried out in Waves 1, 3 and 6) [18,19]. Data from the CAPI and SCQ are used in this paper.
In this study, we use data from the self-completion questionnaire (SCQ) in Wave 3 (collected in 2014/2015), as the Wave 3 SCQ included a module on residential address history (n = 6,687). The questionnaire provided space for respondents to provide exact address details for up to ten locations where they have resided, starting with the most recent. A geocoded dataset of the responses to this questionnaire, supplemented with the current recorded addresses of study participants as collected as part of the primary TILDA interview, was provided to the researchers for the present study (see also Appendix 1). After deletion of those who were not age-eligible (i.e., aged under 54) (n = 299) and those who did not complete the residential address history questionnaire (n = 1,714), 4,674 observations remained for matching with PM2.5 data. As the data on ambient air pollution are available only from 1998 (see Data on ambient air pollution section), we select those respondents with a complete address history for each of the 17 years 1998-2014 inclusive, and assign the PM2.5 concentration in their area in the relevant year to their address in that year (n = 4,021). Deletion of respondents with missing data on outcomes and/or covariates results in a final sample size of 3,407 respondents. Figure 1 sets out the study selection criteria in further detail.

Data on ambient air pollution
We use data on global estimates of annual average PM2.5 concentrations at 0.01 degree resolution (approximately a 1 km grid) between 1998 and 2014 to characterise ambient air pollution at TILDA respondents' addresses. These data are downloaded from [20] and described in [21]. In essence, satellite sensors measure particulates blocking various wavelengths of light in a column of air. The concentration of PM2.5 air pollution in each grid cell across the world is modelled by calibrating the sensor readings to reflect direct ground-based estimates in places where measurements are available and applying a chemical transport model of the atmosphere.
The main explanatory variable of interest ($E_i$) is therefore a proxy for long-run exposure to ambient air pollution. It was calculated by taking the arithmetic mean of the annual PM2.5 concentrations ($C_{it}$) in the 1 km grid square in which each TILDA respondent resided in each year for the 17 years prior to the collection of outcomes data (1998-2014):
$$E_i = \frac{1}{17} \sum_{t=1998}^{2014} C_{it} \tag{1}$$
To protect respondents' confidentiality, we rounded the long-run concentration to the nearest 1 μg/m3. The frequency distribution for this variable is shown in Table 1, with over half of the sample experiencing relatively low levels of exposure (5-7 μg/m3). The sample mean value of the rounded variable is 7.67 μg/m3 and the standard deviation is 1.54 μg/m3. As indicated in Table 1, by far the most prevalent category is 7 μg/m3. We use this as the reference category in models with a categorical representation of PM2.5 concentrations.
To illustrate the main spatial features of our PM2.5 sample, Fig. 2 shows the annual maps for the start and end of the sample period. Concentrations generally fell over time. They also tended to be higher in eastern areas, particularly in the capital city, Dublin. This is consistent with the higher density of population and economic activity in the east and with the prevailing wind, which blows from the west. Some areas along the west coast, mainly in counties Mayo and Galway, were not included on the PM2.5 maps, so some TILDA respondents are omitted from the analysis because pollution data are unavailable for them.
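The construction of the exposure measure can be illustrated with a minimal Python sketch. It assumes one annual concentration per year for 1998-2014 has already been matched to the respondent's address; the function names and the half-up tie-breaking rule for rounding are assumptions made for illustration, not details reported in the study.

```python
import math

N_YEARS = 17  # 1998-2014 inclusive, as described in the text


def long_run_exposure(annual_concentrations):
    """Arithmetic mean of one respondent's annual PM2.5 values (ug/m3)."""
    if len(annual_concentrations) != N_YEARS:
        raise ValueError("expected one concentration per year, 1998-2014")
    return sum(annual_concentrations) / N_YEARS


def round_to_nearest_unit(value):
    """Round to the nearest 1 ug/m3 (ties rounded up: an assumption)."""
    return math.floor(value + 0.5)


# Hypothetical respondent: 8 years at 9.0 ug/m3, then 9 years at 7.0 ug/m3
example = [9.0] * 8 + [7.0] * 9
rounded_exposure = round_to_nearest_unit(long_run_exposure(example))
```

Python's built-in `round` is avoided here because it rounds ties to the nearest even integer, which may not match the rule applied to the original data.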

Data on mental health outcomes
Five indicators of mental health and wellbeing are examined in this study. Scores from the 8-item Center for Epidemiologic Studies Depression Scale (CES-D) are used to measure depression [22]. This validated measure captures the frequency with which respondents report experiencing a series of depressive symptoms within the past week. The items included consist of statements such as 'I felt depressed', 'I felt that everything I did was an effort', 'my sleep was restless' and 'I enjoyed life'. These statements are presented to the respondents during the CAPI and the respondents indicate how often they experienced these feelings (rarely or none of the time (less than 1 day), some or a little of the time (1-2 days), occasionally or a moderate amount of time (3-4 days), or all of the time (5-7 days)). The responses to each item are summed to obtain an overall score (answers to positive statements are reverse coded). Higher scores indicate increased depressive symptomology.
Anxiety is measured using the 7-item Anxiety subscale of the Hospital Anxiety and Depression scale (HADS-A), administered to respondents during the CAPI [23]. Items include 'I felt tense or wound up' and 'Worrying thoughts go through my mind', and respondents are asked to indicate how often they felt this way during the past week ('most of the time', 'a lot of the time', 'from time to time, occasionally', 'not at all'). As with the CES-D, the responses to each item are summed to obtain an overall score (answers to positive statements are reverse coded), with higher scores indicating increased anxiety.
The 8-item Penn State Worry Questionnaire is included in the SCQ [24]. The items include statements such as 'my worries overwhelm me', 'many situations make me worry' and 'I know I should not worry about things, but I just cannot help it'. The respondents are asked to indicate how typical or characteristic each statement is on a five-point scale, from 1 'not at all typical' to 5 'very typical'. Responses to each item are then summed to obtain a total score, ranging from 8 to 40 (with some evidence of 'bunching' at even scores; see Fig. 3). Higher scores indicate a higher level of worry.
The 4-item version of the Perceived Stress Scale (PSS) is used to record stress in the SCQ. The PSS consists of four questions that ask respondents to indicate how they felt in the past month, with answers on a 5-point Likert scale from 0 (Never) to 4 (Very often). A sample item is 'how often have you felt that you were unable to control the important things in your life?'. The range of the PSS score is [0,16], with a higher score indicating higher levels of perceived stress. Although the 4-item version of the PSS asks about how the respondents felt in the past month, it can be used as a measure of chronic stress associated with generalised stress perception and can reflect how unpredictable, uncontrollable, and overloaded an individual's life is [25].
Quality of life is an important measure of overall wellbeing and it is measured using the 12-item Control-Autonomy-Self-Realisation-Pleasure (CASP) scale covering four domains: control (the ability to actively participate in one's environment), autonomy (the right of the individual to be free from unwanted interference of others), self-realisation (the fulfilment of one's potential) and pleasure (the sense of happiness or enjoyment derived from engaging with life). The items included in those domains consist of statements such as 'I look forward to each day', 'my health stops me from doing the things I want to do' and 'I feel that life is full of opportunities'. These statements are presented to the respondents in the SCQ and they are asked to indicate how often (often, sometimes, not often or never) they feel each statement applies to their life. The overall score is obtained by summing each item and higher scores denote better quality of life (answers to the negative statements are reverse coded) [26]. Figure 3 illustrates the frequency distributions for each of the five outcome variables considered in this study.
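The summing and reverse-coding step can be sketched as follows. This is an illustration only: the numeric coding of each response (0 to 3, in order of increasing frequency) and the item keys are assumptions, since the text describes the response categories but not how they are coded.

```python
# Hypothetical item keys; which items count as "positive" must be set by the user.
POSITIVE_ITEMS = frozenset({"enjoyed_life", "was_happy"})
MAX_ITEM_SCORE = 3  # assumed coding: 0 (rarely) ... 3 (all of the time)


def scale_score(responses, positive_items=POSITIVE_ITEMS, max_item=MAX_ITEM_SCORE):
    """Sum item responses, reverse coding the positively worded items."""
    total = 0
    for item, value in responses.items():
        if not 0 <= value <= max_item:
            raise ValueError(f"{item}: response {value} outside 0..{max_item}")
        # A frequent positive feeling should lower the symptom score.
        total += (max_item - value) if item in positive_items else value
    return total


# Hypothetical respondent (a subset of items, for brevity)
example_responses = {
    "felt_depressed": 3,
    "everything_an_effort": 2,
    "restless_sleep": 1,
    "enjoyed_life": 3,  # reverse coded to 0
    "was_happy": 0,     # reverse coded to 3
}
example_score = scale_score(example_responses)
```

The same function would apply to the other summed scales by changing `positive_items` and `max_item`, for instance to a set of negatively worded items when higher totals should denote better quality of life.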

Covariates
A variety of individual-level covariates are included in the statistical models to take account of potentially confounding socio-demographic, socio-economic, health and behavioural characteristics. Tables 2 and 3 present summary statistics for each independent variable. In addition to controls for age, sex, marital status and socioeconomic status (proxied by employment status and highest level of education completed), controls are added for health status and health behaviours. As there is no universal access to public healthcare in Ireland, an indicator for medical card status (which grants those on low incomes access to free public healthcare) is also added to further proxy for health need.

Statistical methods
As described above, the PM2.5 data are matched to residential address history data from TILDA. In combination with detailed TILDA data on a variety of mental health outcomes and important confounders, this allows for the specification of regression models that estimate the association between long-term exposure to PM2.5 air pollution and mental health.
For each of the outcome variables, two variations are modelled: ordinal scales and threshold-based metrics. To assist comparability, each of the metrics based on an ordinal scale is Z-standardised. This involves dividing each score's deviation from the sample mean by the sample standard deviation. The scales transformed in this manner are the CES-D, HADS-A, Penn State Worry Questionnaire, PSS and CASP-12 scores.
As an alternative to the linear specification of pollution effects, we also estimate models with a categorical representation of annual PM2.5 concentrations (see Table 1). In these specifications, dummy variables are included to indicate observations with rounded average exposures at each step of 1 μg/m3, with 7 μg/m3 regarded as the reference category.
In some cases, being in the upper tail of a scale's distribution can provide more clinically relevant information. Metrics in this category aim to detect risk of depression (CES-D score ≥ 9), anxiety (HADS-A score ≥ 11) and the Penn Generalized Anxiety Disorder indicator (Worry scale ≥ 22). These threshold-based measures are modelled as binary (1/0) variables using logit regression. We express the results from these regressions as odds ratios.
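The variable transformations described in this section can be sketched in Python as follows. This is a minimal illustration rather than the estimation code used in the study: the function names, the dummy-variable naming scheme and the example exposure levels are assumptions, and model fitting itself is not shown.

```python
from statistics import mean, stdev


def z_standardise(scores):
    """Deviation from the sample mean divided by the sample standard deviation."""
    m = mean(scores)
    s = stdev(scores)  # sample (n - 1) standard deviation
    return [(x - m) / s for x in scores]


def threshold_indicator(score, cutoff):
    """Binary (1/0) outcome: 1 if the score is at or above the cutoff."""
    return 1 if score >= cutoff else 0


def pm_dummies(rounded_pm, levels, reference=7):
    """One dummy per rounded exposure level, omitting the reference category."""
    return {f"pm_{lvl}": int(rounded_pm == lvl) for lvl in levels if lvl != reference}


z_scores = z_standardise([2, 4, 6])
depression_flag = threshold_indicator(10, cutoff=9)  # CES-D cutoff from the text
dummies = pm_dummies(9, levels=range(5, 13))         # levels 5..12 are illustrative
```

A respondent in the reference category has all dummies equal to zero, so each dummy coefficient is read as a difference from the 7 μg/m3 group.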

Results
Table 4 summarises the standardised coefficients and confidence intervals for the five indicators modelled in this paper. The scales for depressive symptoms and anxiety show strong positive associations with long-term average residential PM2.5 concentrations, with p-values of less than 1 per cent. There is little evidence that any of the other indicators (worry, stress or quality of life) are associated with particulate pollution levels. Full regression results for the models of depressive symptoms and anxiety are in Tables 6 and 7 in Appendix 2. As shown in these tables, estimated PM2.5 effects do not differ substantially between univariate models and versions that adjust for a range of potential confounding factors.
To further explore the relationships between PM2.5 and the depressive symptoms and anxiety indicators, we re-estimate the models using a categorical representation of PM2.5 rather than assuming linearity. The pollution coefficients are illustrated for both models in Fig. 4, and the full regression results are included as Tables 8 and 9 in Appendix 2. These figures reinforce the impression that higher PM2.5 exposures are associated with higher risk of depressive symptoms and anxiety. Indeed, in the case of depressive symptoms the relationship appears strikingly linear, at least above the reference category of 7 μg/m3. Information criterion tests tend to favour the linear PM2.5 specification over the categorical PM2.5 specification; for example, the linear depressive symptoms model has an AIC of 9,301 and a BIC of 9,393 compared to an AIC of 9,307 and a BIC of 9,436 for the categorical version. In Table 5, we investigate whether the results (with a linear specification of PM2.5) hold when the outcome variables for depression, anxiety and stress are expressed using threshold values. While there is now no evidence that long-term PM2.5 concentrations are associated with the binary indicator of clinically-significant depressive symptoms, the p-value for anxiety suggests an association between PM2.5 and clinically-significant anxiety.

Discussion and conclusions
Mental illness is the leading cause of years lived with disability, and the global disease burden of mental ill-health has increased substantially in the last number of decades [27]. There is now increasing evidence that environmental conditions, and in particular poor air quality, may be associated with mental health and wellbeing. We find evidence of associations between long-term exposure to ambient PM2.5 and validated indicators of depressive symptoms and anxiety for a large sample of over-50s in Ireland. Allowing for a range of potential confounding factors (age, sex, employment status, marital status, long-term health limitations, alcohol consumption problems, smoking status, polypharmacy and entitlement to free public healthcare) does not substantially affect these findings.
The measured associations are strong, particularly considering the relatively low PM2.5 concentrations prevailing in Ireland compared to many other countries. While it is estimated that over 90 per cent of the world's population lives in areas with annual mean PM2.5 concentrations greater than 10 μg/m3 [21], these results contribute to the increasing evidence that suggests that harmful effects can be detected at even low levels of air pollution. To illustrate the strength of these relationships in our sample, note that moving from the reference category to the highest average PM2.5 exposure in our sample (7 to 12 μg/m3) involves an increase of 5 μg/m3. Multiplying the PM2.5 coefficient in the depressive symptoms model by 5 implies an increase of 16.2% of a standard deviation on the CES-D scale. This scale of effect is broadly comparable to the higher depressive symptom score among females (16.9%) compared to males, and it is larger than the marginal effect of being in the subsample taking 5+ medications (13.6%), as shown in the full regression results for the depressive symptoms model (Table 6 in Appendix 2).
We find no evidence of associations between long-term ambient particulate pollution and other indicators of mental health and well-being: stress, worry and quality of life. Understanding why long-term PM2.5 concentrations are associated with depression and anxiety, but not other indicators of mental health and wellbeing, is challenging and worthy of further research. It is possible that different dimensions of mental health may be more or less influenced by the length of exposure, the specific type of pollutant and/or omitted confounding variables. We are aware of just one study [15] that investigates the impact of differing lengths of exposure to PM2.5 (and PM10 pollution) on mental health; using data from the US Nurses' Health Study (age range 57-87), the authors found that exposure to fine particulate matter (PM2.5) was associated with higher symptoms of anxiety, with more recent exposures potentially more relevant than more distant exposures. However, these results are not directly comparable with the results in this study given the substantial difference in the study populations of interest.
While we have been able to control for many potential confounding factors at individual level, this is a cross-sectional study so it is not possible to draw conclusions about causality. Ideally, repeated measurements of mental health would be available for the 17-year period for which we have PM2.5 concentration data; in the absence of such data, this study adopted a cross-sectional approach investigating the link between 17-year annual average PM2.5 concentrations and mental health and wellbeing, measured in 2014/2015. In addition, pollution exposures were not randomly assigned to respondents, so there may have been some selection away from polluted areas among those able to afford better environments or those particularly affected by air pollution. Future work could exploit 'natural experiments', such as policy changes, to identify the causal impacts of air pollution on mental health. See [27] for an application using data from the China Health and Retirement Longitudinal Study (CHARLS), a sister study of TILDA.
Measured effects may also have been influenced by omission of potentially important correlated factors such as other air pollutants or noise exposure [15]. The WHO note that in everyday life, individuals are exposed to a mixture of air pollutants that varies in space and time [1]. It is therefore possible that the association we observed for long-term PM2.5 is attributable, in whole or in part, to a correlation between PM2.5 and another exposure. For example, [16] find a large and statistically significant positive association between average annual ambient local PM2.5 concentrations in 2010 and a broad indicator of depression based on self-reported symptoms of nerves, anxiety, tension or depression (using data on adults aged 40-69 from the UK Biobank). The odds of reporting one or more of these symptoms are reported to be 2.31 (95% CI: 2.15-2.50) times higher per 10 μg/m3 increase in PM2.5. Positive associations are also reported for indicators of major depression and bipolar disorder. The authors also find an independent association between mental health outcomes and a modelled proxy for road traffic noise, highlighting the difficulty in assessing the independent effects of different pollutants associated with a common source (i.e., road traffic) on mental health. Conversely, [13] find that the significant results of traffic noise on mental health are attenuated when adjusting for other types of pollution (such as PM2.5), but that the significant effects of PM2.5 on the hazard rate of schizophrenia spectrum disorder, anxiety and depression were only slightly reduced when adjustment was made for the other exposure variables such as traffic noise. Unfortunately, data availability on other exposures is limited for our sample, particularly for historical periods.
Other limitations include the fact that PM2.5 annual average concentrations are rounded to the nearest 1 μg/m3 (a condition of data access to protect respondents' anonymity), which reduces the variation in PM2.5 concentrations across time and space. In addition, while ambient PM2.5 concentrations are most commonly used in studies of this kind, personal exposure is influenced by the different microenvironments or activities an individual experiences (e.g., time in traffic, indoor sources, second-hand tobacco smoke, occupational exposure, and degree of penetration of ambient air pollution into homes) [28] and is much more difficult to measure. For future work of this kind, it would be particularly useful also to be able to match historical exposures dating back to early childhood to later life mental health outcomes, via complete address histories. As density of ground sensor networks improves (at least in developed countries), more granular exposure estimates should also provide greater sample variation. As in many other applications in the literature, individual-level exposure to PM2.5 is calculated using land-use regression models to determine approximate annual concentrations at study participants' residential addresses. While individuals' activity patterns also increase exposure misclassification, alternative methods such as using distance from major roads [4] and self-reported measures [29] are problematic, and personal exposure monitoring remains prohibitively expensive for large-scale studies [4].
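The arithmetic behind this comparison can be written out explicitly. The symbol $\hat{\beta}$ for the linear PM2.5 coefficient is introduced here for illustration, and the per-unit value below is back-calculated from the reported 16.2% figure rather than read from Table 6.

```latex
\[
5\ \mu\mathrm{g/m^3} \times \hat{\beta} = 0.162\ \text{SD}
\quad\Longrightarrow\quad
\hat{\beta} \approx \frac{0.162}{5} = 0.0324\ \text{SD per } \mu\mathrm{g/m^3}
\]
```

On this reading, each additional 1 μg/m3 of long-run exposure corresponds to roughly 3% of a standard deviation on the CES-D scale, so the 5 μg/m3 contrast (16.2%) sits just below the reported female-male difference (16.9%) and above the polypharmacy effect (13.6%).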

Data on historical addresses
As part of Wave 3 data collection for The Irish Longitudinal Study on Ageing (TILDA), participants were asked to report the addresses at which they have lived throughout their lives. The residential history module (RHM) was administered as part of the self-completion questionnaire (SCQ) and asked participants the following: "Where did you previously live? Please start with the most recent previous address first and then the second most recent, and so on". The questionnaire provided space for respondents to provide exact address details for up to ten locations where they have resided. A geocoded dataset of the responses to this questionnaire, supplemented with the current recorded addresses of study participants as collected as part of the primary TILDA interview, was provided to the researchers for the present study (n = 4,674).
An extensive data cleaning exercise was undertaken to ensure the validity of the residential history information used. The primary goal of the data validation exercise was to create a respondent-year panel dataset detailing the location at which each respondent resided between 1998 and 2014. An overview of the steps taken to generate this dataset is provided below:

Preliminary data cleaning
We first conducted an initial data cleaning exercise in which clear errors in the data were identified and rectified. This exercise included a check for duplication of respondents and the identification of cases where the stated duration of residence was unreasonable (e.g., negative). We also removed geographic coordinates that had been assigned to address locations that did not correspond to a valid location in either the Republic of Ireland or Northern Ireland. While all those included in the baseline interview in 2009/2010 were residents of the Republic of Ireland, those who subsequently moved to Northern Ireland (but not other countries) were followed up for interview.

Classify respondents' engagement with the RHM questionnaire
Since the RHM was contained in the TILDA SCQ, participants were not compelled to complete the questionnaire fully. Consequently, there was significant variation in the extent to which participants engaged with the exercise. Identifying individuals who did not engage at all was essential, as any observed data pertaining to these individuals could not have been derived from the RHM. These data were instead derived from the TILDA current address dataset, which did not require further validation as it was not subject to the same likelihood of reporting errors as the RHM. We thus restricted further data cleaning and imputation steps to individuals identified as giving some information beyond their current address in the RHM.

Improve location quality
Not all reported addresses had been assigned an exact geographic location in the version of the RHM data provided to the researchers. We employed several strategies to assign the most detailed geographical identifier possible to incomplete addresses. In many cases, we could identify the county (one of twenty-six in the Republic of Ireland) to which an incomplete address pertained. To assign addresses to the appropriate county, we employed the following strategies:
A. We developed a correspondence table to assign incomplete but known address entries to the appropriate county.
B. Where respondents indicated that they have always lived at the same address or identified an additional period during which they lived at their current address, we assigned the geo-location of their current address to that entry.
C. We applied an automated text search algorithm to the remaining incomplete addresses, allowing us to match some of the remaining entries to the relevant county.

Deal with invalid year-of-residence information
In some cases, respondents reported locations where they have lived but did not detail the years during which they resided at these addresses. Such cases were necessarily excluded from our analysis. We also took several actions to resolve cases where it appeared that the years that respondents reported living at an address were incorrect. First, where the order of years was backwards on the RHM form, we assumed this occurred because of a genuine error and reversed the order accordingly. Second, where respondents' age indicated that a reported address was inhabited before their birth, we adjusted the entries as follows: If the respondent's sequence of address entries suggested that they 'moved out' of the address before their date of birth, we removed the entry entirely. If the sequence of entries indicated that they moved away from the location after birth, we assumed that the location corresponded to the respondent's address from their approximate year of birth.
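The rules above can be expressed as a short Python sketch. It is a simplified reading of the text, not the cleaning code used for the study: the function name and the representation of a residence spell as a pair of calendar years are assumptions.

```python
def clean_spell(start_year, end_year, birth_year):
    """Return a cleaned (start, end) pair, or None if the entry is dropped."""
    # Entries without year information cannot be placed in time.
    if start_year is None or end_year is None:
        return None
    # Years entered backwards are treated as a genuine error and swapped.
    if start_year > end_year:
        start_year, end_year = end_year, start_year
    # The respondent 'moved out' before being born: drop the entry.
    if end_year < birth_year:
        return None
    # The spell began before birth but ended after it: start it at birth.
    if start_year < birth_year:
        start_year = birth_year
    return (start_year, end_year)


# Hypothetical entries for a respondent born in 1950
swapped = clean_spell(2005, 1999, birth_year=1950)
dropped = clean_spell(1930, 1940, birth_year=1950)
truncated = clean_spell(1940, 1960, birth_year=1950)
```

Returning `None` for dropped entries keeps the exclusion explicit, so a later step can count how many spells were removed by each rule.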

Geocode validation
The geographic coordinates assigned to each RHM address were not generated as part of the present analysis. We carried out a set of verification exercises to ensure that these coordinates referenced exact residential addresses rather than the centroids of any larger geographic unit. We assigned each set of address coordinates to several official geographic administrative units containing the coordinates and calculated the distance from that address to the unit's centroid. Specifically, we linked each set of coordinates to its county, electoral district, and small area administrative unit as designated by the Central Statistics Office of Ireland. The exercise revealed no significant concerns that the reported coordinates referred to administrative areas rather than exact addresses.
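A check of this kind might look like the sketch below, which computes the great-circle (haversine) distance between an address and a unit centroid and flags addresses that coincide with the centroid. The function names and the 10-metre tolerance are assumptions made for illustration; the text does not state the distance measure or threshold actually used.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def looks_like_centroid(address, centroid, tol_km=0.01):
    """Flag an address whose coordinates sit on a unit's centroid.

    `address` and `centroid` are (lat, lon) tuples; `tol_km` is a
    hypothetical tolerance, not a value reported in the paper.
    """
    return haversine_km(*address, *centroid) <= tol_km
```

Running the check against the county, electoral district and small area centroids in turn would show whether a suspicious share of addresses sits exactly on any of them.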

Create a respondent-year panel of address data
We transformed the RHM response dataset into a respondent-year panel of addresses in which we assigned the appropriate address to each year between participants' estimated year of birth (based on age at the time of interview) and their interview year. Frequently, we observed overlaps in the dataset such that more than one address could be assigned to a given year. We took the following approach to address this issue: First, we created a distance matrix between all the addresses each participant reported over their lives and extracted the straight-line distance between conflicting addresses. If that distance was zero, we removed the conflict. For the remaining conflicts, we adopted the practice of taking the first address in the conflict series. This equated to using the address reported first in the RHM questionnaire. The underlying logic was that the information reported first was likely to be the most accurate. We applied one exception to this rule: Where an address within the Republic of Ireland conflicted with an indicator that the respondent was living abroad at the time, we coded the respondent as having been living abroad. A typical migration pattern among this cohort was to spend a period living abroad before returning to live in the Republic of Ireland. Since this usually constitutes a significant life event, we assumed that respondents would recall it accurately.
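The conflict rules can be sketched as follows, assuming each respondent's history is a list of (address, start year, end year) spells in questionnaire order. The function name, the spell representation and the "ABROAD" label are hypothetical. Zero-distance conflicts need no special handling here because both candidates are the same address.

```python
def build_panel(spells, birth_year, interview_year):
    """Build a {year: address} panel from residence spells.

    `spells` is a list of (address, start, end) tuples in the order
    reported; the label "ABROAD" marks a spell spent outside Ireland.
    """
    panel = {}
    for year in range(birth_year, interview_year + 1):
        candidates = [addr for addr, start, end in spells
                      if start <= year <= end]
        if not candidates:
            panel[year] = None             # no address reported for this year
        elif "ABROAD" in candidates:
            panel[year] = "ABROAD"         # exception: abroad takes precedence
        else:
            panel[year] = candidates[0]    # otherwise, first-reported address
    return panel
```

For example, a respondent reporting address A for 1950-1970, address B for 1970-1990 and a spell abroad in 1960-1962 would be assigned A in the overlapping year 1970 and coded as abroad for 1960-1962.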

Combine all valid data to maximise sample size
Many respondents (including those who did not engage with the RHM at all) will have lived at their current address throughout the period of analysis in this paper (1998-2014). As such, we could supplement the information from the RHM with information on respondents' current addresses as recorded as part of the primary TILDA interview. In the Wave 3 face-to-face interview, participants were asked: "For how many years have you lived at this address?" We used the responses to this question to construct a respondent-year panel, similar to that created from the RHM data, that ran from the first year respondents reported being at that address up to the year of interview. The combined RHM and current address panels (for n = 4,497 respondents) were then used for linking with historical air pollution data. See also Fig. 1.
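A minimal sketch of this step is given below. The function names are hypothetical, and the precedence rule (use the RHM address where one exists and fall back on the current-address panel otherwise) is an assumption made for illustration; the text does not state how disagreements between the two sources were handled.

```python
def current_address_panel(address, interview_year, years_at_address):
    """Panel running from the first year at the current address to interview."""
    first_year = interview_year - years_at_address
    return {year: address for year in range(first_year, interview_year + 1)}


def combine_panels(rhm_panel, current_panel, start=1998, end=2014):
    """Combine both panels over the analysis period (1998-2014).

    Assumption: RHM information takes precedence where available.
    """
    combined = {}
    for year in range(start, end + 1):
        address = rhm_panel.get(year)
        if address is None:
            address = current_panel.get(year)
        combined[year] = address
    return combined
```

A respondent interviewed in 2014 who reported 20 years at their current address would contribute that address for every year from 1998 to 2014, even without any RHM response.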

Appendix 2
Full model results

Fig. 1 Flowchart of study selection criteria

Fig. 3 Sample frequency distributions for mental health and well-being indicators (n = 3,407)

Fig. 4 Coefficients on rounded PM2.5 exposure categories in fully adjusted OLS models of CES-D depressive symptoms Z-score and HADS-A anxiety Z-score; reference category = 7 μg/m3

Table 1
Descriptive statistics for categorical PM2.5 exposure variable, rounded to the nearest 1 μg/m3

Table 2
Descriptive statistics for continuous explanatory variables

Depressive symptoms scale, the HADS-A Anxiety scale, the Penn State Worry scale, the Perceived Stress scale and the CASP Quality of Life scale. Details of these scales are given in the Data on mental health outcomes section. Coefficients in the Z-standardised models show the difference in the dependent variable, in standard deviations, associated with a unit change in a given explanatory variable. Equation 2 illustrates the model, with the Z-standardised outcome for individual i in 2014-2015 ($H_i$) explained as a linear function of a constant, an estimate of the individual's long-term exposure to ambient pollution for the previous 17 years ($E_i$, discussed in the Data on ambient air pollution section), a vector of socioeconomic controls $X_i$ (see the Covariates section) and a random error term $\varepsilon_i$.
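For reference, the model described above can be written out as below. This is a sketch based on the verbal description: the symbols $H_i$, $E_i$, $X_i$ and $\varepsilon_i$ follow the text, while the coefficient symbols $\alpha$, $\beta$ and $\gamma$, and the notation $y_i$, $\bar{y}$ and $s_y$ for the raw scale score and its sample mean and standard deviation, are assumptions introduced here.

```latex
% Z-standardisation of a raw scale score y_i
% (sample mean \bar{y}, sample standard deviation s_y)
H_i = \frac{y_i - \bar{y}}{s_y}

% Linear model for the standardised outcome
H_i = \alpha + \beta E_i + X_i' \gamma + \varepsilon_i
```

Under this specification, $\beta$ is the difference in the outcome, in standard deviations, associated with a 1 μg/m3 difference in long-term average PM2.5 exposure.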

Table 3
Descriptive statistics for categorical sociodemographic variables

Table 4
Summary of linear coefficients on PM2.5 exposure (μg/m3) in models of Z-standardised mental health scales with full set of controls

Table 5
Summary of odds ratios on PM2.5 exposure (μg/m3) in models of binary mental health metrics with full set of controls

Table 6
OLS regression results for Z-standardised CES-D depressive symptoms scale using linear rounded long-term average PM2.5 concentration

Table 7
OLS regression results for Z-standardised HADS-A anxiety scale using linear rounded long-term average PM2.5 concentration

Table 8
OLS regression results for Z-standardised CES-D depressive symptoms scale using categorical rounded long-term average PM2.5 concentration

Table 9
OLS regression results for Z-standardised HADS-A anxiety scale using categorical rounded long-term average PM2.5 concentration